title: Gush: A stack based language eventually for genetic programming
date: 2017-04-06 14:15
author: Christine Lemmer-Webber
tags: gush, foss, genetic programming, anti-abuse
slug: gush-intro
---
(This blogpost was going to be just about a project I'm working on Gush,
but instead it's turned into a whole lot of backstory, and then a short
tutorial about [Gush](https://gitlab.com/dustyweb/gush). If you're just
interested in the tutorial, skip to the bottom I guess.)

I recently wrote about [possible routes for anti-abuse
systems](/blog/possible-distributed-anti-abuse/). One of the goofier
routes I wrote about on there discussed genetic programming. I get the
sense that few people believe I could be serious... in some ways, I'm
not sure if I myself am serious. But the idea is so *alluring!* (And,
let's be honest, entertaining!) Imagine if you had anti-abuse programs
on your computer, and they're growing and evolving based on user
feedback (hand-waving aside exactly what that feedback is, which might
be the hardest problem), adapting to new threats somewhat invisibly from
the user benefiting from them. They have a set of friends who have
similar needs and concerns, and so their programs propagate and
reproduce with programs in their trust network (along with their
datasets, which may be taught to child programs also via a genetic
program). Compelling! Would it work? I dunno.

A different, fun use case I can't get to leave my mind is genetic
programs as enemy AI in roguelikes. After all, [cellular
automata](https://en.wikipedia.org/wiki/Cellular_automaton) are a class
of programs frequently used to study genetic programs. And roguelikes
aren't tooooooo far away from cellular automata where you beat things
up. What if the roguelike adapted to you? Heck, maybe you could even
collect and pit your genetic program roguelike monsters against your
friends'. Roguelike Pokémon! (Except unlike in Pokémon, "evolving"
actually really means "evolving".)

# Speculating on the future with Lee Spector

Anyway, how did I get on this crazy kick? On the way back from
LibrePlanet (which went quite well, and deserves its own blogpost), I
had the good fortune to be able to meet up with [Lee
Spector](http://faculty.hampshire.edu/lspector/). I had heard of Lee
because my friend Bassam Kurdali works in the same building as him in
Hampshire College, and Bassam had told me a few years ago about Lee's
work on genetic algorithms. The system Lee Spector works on is called
[PushGP](http://faculty.hampshire.edu/lspector/push.html) (Push is the
stack language, and PushGP is Push used for genetic programming). Well,
of course once I found out that Push was a lisp-based language, I was
intrigued (hosted on lisps traditionally, but not always, and Push
itself is kind of like
[Lisp](https://en.wikipedia.org/wiki/Lisp_%28programming_language%29)
meets
[Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29)),
and so by the time we met up I was relatively familiar with Lee's work.

Lee and I met up at the Haymarket Cafe, which is a friendly coffee shop
in Northampton. I mentioned that I had just come from LibrePlanet where
I had given a talk on [The Lisp Machine and
GNU](https://media.libreplanet.org/u/libreplanet/m/the-lisp-machine-and-gnu/).
I was entertained that almost immediately after these words left my
mouth, Lee dove into his personal experiences with lisp machines, and
his longing for the kind of development experiences lisp machines gave
you, which he hasn't been able to find since. That's kind of an aside
from this blogpost I suppose, but it was nice that we had something
immediately to connect on, including on a topic I had recently been
exploring and talking about myself. Anyway, the conversation was pretty
wild and wide-ranging.

I had also had the good fortune to speak to [Gerald
Sussman](https://groups.csail.mit.edu/mac/users/gjs/) again at
LibrePlanet this year (he also showed up to [my
talk](https://media.libreplanet.org/u/libreplanet/m/the-lisp-machine-and-gnu/)
and answered some questions for the audience). One thing I observed in
talking with both Sussman and Spector is they're both very interested in
thinking about where to bring computing in terms of examining biological
systems, but there was a big difference in terms of their ideas; Sussman
is very interested in holding machines "accountable", which seems to
frequently also mean being able to examine how they came to conclusions.
(You can see more of Sussman's thinking about this in the writeup I did
of [the first time I ever got to speak with
Sussman](/blog/sussman-on-ai/) when I was at FSF's 30th anniversary
party... maybe I should try to capture some information about the most
recent chat too, before it gets lost to the sands of time...) Spector,
on the other hand, seems convinced that to make it to the next level of
computing (and maybe even the next level of humanity), we have to be
willing to give up on demanding that we can truly understand a system,
and that we have to allow processes to run wild and develop into their
own things. There's a tinge of [Vernor Vinge's original vision of the
Singularity](https://www-rohan.sdsu.edu/faculty/vinge/misc/singularity.html),
which is that there's a more advanced level of intelligence than humans
are able to currently comprehend, and to understand it there you have to
have crossed that boundary yourself, like the impossibility of seeing
past the [event horizon](https://en.wikipedia.org/wiki/Event_horizon) of
a black hole. (Note that this is a pretty different definition of
singularity from some of the current definitions of Singularity that
have come since, which are more about possible outcomes of such a change
rather than about the concept of an intellectual/technological event
horizon itself.) That's a possible vision for the future of humanity,
but it's also a vision of what maybe the right direction is for our
programming too.

Of course, this vision that code may be generative in a way where the
source is mostly unintelligible to us feels possibly at odds to our
current understanding of software freedom, a movement which spends a lot
of time talking about inspecting source code. The implications of that
is a topic of interest of mine (I [wrote about it in the FSF bulletin
not too long
ago](https://www.fsf.org/bulletin/2016/spring/user-freedom-in-the-age-of-computer-generated-software))
and I did needle Spector about it... what does copyright mean in a world
where humans aren't writing software? Spector seems to acknowledge that
it's a concern (he agrees that examples of seed-DRM and genetic patents
in the case of Monsanto are troubling) but I got the sense that it's not
his biggest interest... Spector thinks that the future of that side of
software freedom and copyright might be that humanity realizes how
absurd software copyright and patents and other intellectual restriction
regimes are once things generate far enough. But I get the sense that
more than talking about the legal/licensing aspects of the
auto-generative future, Spector would rather be *building* it. Fair
enough!

Anyway, somewhere along these lines I mentioned my interest in
distributed anti-abuse systems. We talked about how more basic
approaches such as Bayesian filtering might not be good enough to combat
modern abuse beyond just spam especially, because the attack patterns
taken change so frequently. Suddenly it hit me: I wonder whether or not
genetic programs would work pretty well in a distributed system... after
all, you could use your web of trust to breed the appropriate filtering
programs with your friends' programs... would it work?

Anyway, on the ride back I began playing with some of Push's ideas, and
(with a lot of helpful feedback from Lee Spector... thank you, Lee!) I
started to put together a toy design for a language inspired by PushGP
but with some properties that I think might be more applicable to an
anti-abuse system that needs to keep around "memories" between
generations. (Whether it's better or not, I don't really know yet.)
So...

# A little Gush tutorial

At this point, [Gush](https://gitlab.com/dustyweb/gush) exists. It has
the stack based language down, but none of the genetic programming.
Nonetheless, it's fun to hack around in, looks an awful lot like Push
but also is far enough along to demonstrate its differences, and if
you've never played with a stack based language before, it might be a
good place to start.

Let's do some fun things. First of all, what does a Push program look
like?

``` scheme
(1 2 + dup *)
```

Ok, that looks an awful lot like a lisp program, and yet not at the same
time! If you've installed Gush, you can run this example:

``` scheme
> (use-modules (gush))
> (run '(1 2 + dup *))
$1 = (9)
```

Whoo, our program ran! But what happened? Gush programs operate on two
primary stacks... there's an "exec" stack, which contains the program
being evaluated in progress, and a "values" stack, with all the values
currently built up by the program. Evaluation happens like so:

``` scheme
exec> '(1 2 + dup *)  ; initial exec stack
vals> '()             ; initial empty values stack
exec> '(2 + dup *)    ; [=> 1] popped off from exec stack
vals> '(1)            ; push 1 (a literal) onto values stack
exec> '(+ dup *)      ; [=> 2] popped off from exec stack
vals> '(2 1)          ; push 2 (a literal) onto values stack
exec> '(dup *)        ; [=> +] popped off of exec stack
vals> '(3)            ; apply `+', which is bound to an operation which takes
                      ;   top two numbers on the values stack and adds them,
                      ;   pushing result onto values stack... 2 + 1 = 3,
                      ;   so 2 and 1 are removed and 3 is added
exec> '(*)            ; [=> dup] popped off of exec stack
vals> '(3 3)          ; `dup' takes top item on stack and duplicates it
exec> '()             ; [=> *] popped off of exec stack
vals> '(9)            ; apply `*', which is bound to an operation which takes
                      ;   top two numbers on values stack and multiplies them,
                      ;   pushing result onto values stack... 3 * 3 = 9,
                      ;   so 3 and 3 are removed and 9 is added
                      ; Nothing left to do on exec, so we're done!
```

Okay, great! What else can we run?

``` scheme
;; Complicated arithmetic runs
> (run '(3 2 / 4 +))
$2 = (14/3)

;; We can assign variables to values and then reference them
> (run '(88 'foo define foo 22 +))
$3 = (110)

;; However, base operations "know" what types to apply for, and search
;; the stack... the string will be "skipped over" in search for
;; a number here.  This means that we can randomly generate code
;; and we won't run into type errors.
> (run '(1 "two" 3 +))
$4 = (4 "two")   

;; Nested parentheses will be "unnested" and applied inline
> (run '(98 ("balloons" "red") 1 +))
$5 = (99 "red" "balloons")
;; so that's the same as
> (run '(98 "balloons" "red" 1 +))
$6 = (99 "red" "balloons")

;; Variables are actually stacks!  Which means we can build up
;; complicated operations on them...
> (run '('+ 'foo define      ; set foo to '(+)
         88 'foo var-push    ; append 88, so foo is now '(88 +)
         2 foo))             ; apply variable stack foo
$7 = (90)

;; Conditionals, etc also work.
> (run '(1 1 + 'b define  ; assign b to the value of 1 + 1
         2 b = if         ; check if b is 2
           "two b"        ; if-then clause
           "not two b"))  ; if-else clause
$8 = ("two b")
```

There's more to it than that, but that should get you started.

# How is Gush different from Push?

Gush takes almost all its good ideas from Push, but there are two big
differences.

Both Gush and Push try to avoid type errors. You can do all sorts of
code mutation, and whether or not things will actually do anything
useful is up for grabs, but it shouldn't crash to a halt. The way Push
does it is via different stacks for each value type. This is really
clever: it means that each operation applies to very specific types, and
if you always know your input types carefully, you can always be safe on
a type level and shouldn't have programs that unexpectedly crash (if
there aren't enough values on the appropriate type stack, Push just
no-ops).

But what if you want to run operations that might apply to more than one
type? For example, in Gush you might do:

``` scheme
> (run '(1 2 / 4.5 +))
$9 = (6.5)
```

In Push, you'd probably do something like this:

``` scheme
> (run '(1 2 INTEGER./ FLOAT.FROMINTEGER 4.5 FLOAT.+))
$9 = (6.5)
```

(And of course, if you wanted rational numbers rather than just floats,
you'd have to add another type stack to that...)

I really wanted generic methods that were able to determine what types
they were able to apply to. For one thing, imagine you have a program
that's doing a lot of complicated algebra... it *should* be able to
operate on a succession of numbers without having to do type coercion
and hit/miss on whether it chose the right of several typed operators,
when it could just pick one operator that can apply to several items.

I also wanted to be able to add new types without much difficulty. As it
is, I don't have to rewire anything to throw hash-maps into Gush:

``` scheme
> (run '(42 "meaning of life" make-hash hash-set))
$10 = (<hash-table>)
```

This would just work, no need to wire anything new up!

The way Gush does it is it uses generic operators which know how to
check the predicates for each type, and which "search the stack" for
values it knows it can apply. (It also no-ops if it can't find
anything.) If bells and alarms are going off, you're not wrong! In
Gush's current implementation, this does have the consequence that any
given operation might be worst case O(n) of the size of the values
stack! Owch! However, I'm not too worried. Gush checks how many
operations every program takes (and has the option to bail out if a
program is taking too many steps) and searching the stack after failing
to match initially counts against a program. I figured that if programs
are auto-generated, one fitness check can be how many steps it takes for
the program to finish its computation, and so programs would be
incentivized to keep the appropriate types near where they would be
useful. I'm happy to say that it turns out I'm not the only one to think
this; unknown to me when I started down this path, there's another Push
derivative named
[Push-Forth](https://www.lri.fr/~hansen/proceedings/2013/GECCO/companion/p1635.pdf)
which has only one stack altogether (not even separate stacks for exec
and values!) and it does some similar-ish (but not quite the same)
searching (or converging on a fixed point) by currying operations until
the appropriate types are available. (Pretty cool stuff, but to be
honest I have a hard time following the Push-Forth examples I've seen.)
It comes to the same conclusion that by checking the number of steps a
program takes to execute as part of its fitness, programs will be
encouraged to keep types in good places anyway. However! There's more
reasons to not despair; I'm relatively sure that there are some clever
things that can be done with Gush's value stack so that predicate
information is cached and looking for the right type can be made O(1).
That has yet to be proven though. :)

The other feature Gush differentiates itself from Push is that Gush
variables are stacks rather than single values. This ties in nicely with
the classic Push approach that lists are unwrapped and applied to the
exec stack at the time they are to be evaluated anyway, so it makes no
behavioral change in the case that you just use "define" (which will
always clobber the state of the stack, whether or not it exists, to
replace it with a stack with a single element of the new value). But it
also allows you to build up collections of information over time... or
even collections of code. An individual variable can be appended to and
modified as the program runs, so you could write or even modify
subroutines to variables. (Code that writes code! Very lispy, but also a
bit crazier because it might happen at runtime.) Push also has this
feature, but it has one specific, restricted stack for it, named the
CODE stack appropriately. Why have one of these stacks, when you could
have an unlimited number of them?

That wasn't the original intent for having variables as values though; I
only realized that you could make each variable into a kind of CODE
stack later. My original intent was driven by a concern/need to be able
to carry information from parent process to child process. I added a
structure to Gush programs named "memories", and I figured that parent
programs could "teach" their memories to child programs. So this was
really just a hash table of symbols to stacks that persisted after the
program ended (which, since if you use run-application you get the whole
state of the program as the same immutable structure that is folded over
during execution, you have that information attached to the application
anyway). The idea of "memories" was that parent programs could have
another program that, after spawning a child program, could "teach" the
things they knew to their children (possibly either by simply copying,
or more likely through a separate genetic program applied to that same
data). That way a database of accrued information could be passed around
from generation to generation... a type of genetic programming
educational system (or folklore). So that was there, but then when I
began adding variables around the same time I realized that a variable
that contained a single value and which was pushed onto to the exec
stack was, due to the way Push "unwraps" lists, exactly the same as if
there was just that variable alone pushed onto the stack. Plus, it
seemed to open up more paths by having the cool effect of having any
variable be able to take on the power of Push's CODE stack. (Not to
mention, removing the need for a redundant CODE stack!)

Are these really improvements? I don't know, it's hard to say without
actually testing with some genetic programming examples. That part
doesn't exist yet in Gush... probably I'll follow the current lead of
the Push community and do mutation on the [linearized Plush
representation of Push
code](https://push-language.hampshire.edu/t/plush-genomes/279).

Anyway, I also want to give a huge thank you to Lee Spector. Lee has
been really patient in answering a lot of questions, and even in the
case that Gush does have some improvements, they're minor tweaks
compared to the years of work and experimentation that has gone into the
Push/PushGP designs.

And hey, it was a lot of fun! Not to mention, a great way to
procrastinate on the things I *should* be working on...
